feat(opentelemetry-instrumentation-aws-lambda): Add sqs context propagation #2981

wpessers · 2025-08-12T22:17:17Z

Which problem is this PR solving?

#2921 No SQS Context propagation in aws lambda.

Short description of the changes

Used the pubsub propagation utils to automatically create processing spans for each SQS record within a Lambda SQS event. These spans link to their producer span, as described in the spec: https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/#consumer-spans
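To make the description concrete, here is a minimal, dependency-free sketch of the idea: one processing span per SQS record, each carrying a link to its producer. It is not the code in this PR. The `Sketch*` types, `parseTraceparent`, and `describeProcessSpans` are hypothetical names, the span name format is an assumption, and the sketch assumes the producer wrote a W3C `traceparent` value into the message attributes.

```typescript
// Simplified local stand-ins for the aws-lambda and OpenTelemetry types,
// so that the sketch runs without any dependencies.
interface SketchSqsRecord {
  messageId: string;
  messageAttributes: Record<string, { stringValue?: string }>;
}

interface SketchSpanLink {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

interface SketchProcessSpan {
  name: string;
  messageId: string;
  links: SketchSpanLink[];
}

// W3C Trace Context header layout: version-traceid-parentid-flags
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string | undefined): SketchSpanLink | undefined {
  if (header === undefined) return undefined;
  const match = TRACEPARENT_RE.exec(header.trim());
  if (match === null) return undefined;
  const [, version, traceId, spanId, flags] = match;
  // Version "ff" and all-zero ids are invalid per the W3C spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(spanId)) {
    return undefined;
  }
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Describes one processing span per record; a record without usable
// producer context still gets a span, just without a link.
function describeProcessSpans(
  records: SketchSqsRecord[],
  queueName: string
): SketchProcessSpan[] {
  return records.map((record) => {
    const link = parseTraceparent(record.messageAttributes['traceparent']?.stringValue);
    return {
      name: `process ${queueName}`, // assumed naming, not taken from the PR
      messageId: record.messageId,
      links: link ? [link] : [],
    };
  });
}
```

In the real instrumentation the descriptors would be actual spans started under the function invocation span; the sketch only shows which producer each record would link to.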

wpessers · 2025-08-12T22:18:47Z

Converted this back to a draft PR. There are some things I need to figure out first. I used the SQS instrumentation from the aws-sdk instrumentation package as an example, but it uses a bunch of deprecated semconv attributes.

codecov · 2025-08-13T22:03:58Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.85%. Comparing base (cc7eff4) to head (1dca665).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2981      +/-   ##
==========================================
+ Coverage   89.83%   89.85%   +0.01%     
==========================================
  Files         188      188              
  Lines        9288     9311      +23     
  Branches     1905     1912       +7     
==========================================
+ Hits         8344     8366      +22     
- Misses        944      945       +1

Files with missing lines	Coverage Δ
.../instrumentation-aws-lambda/src/instrumentation.ts	`95.02% <100.00%> (+0.38%)`	⬆️

... and 2 files with indirect coverage changes

wpessers · 2025-08-17T21:59:58Z

@jj22ee Any thoughts on this PR? There's some use of deprecated semconv attributes for the messaging spans. But these only have alternatives that are still incubating currently. So I believe we would prefer the deprecated ones to avoid possibly breaking on minor version updates for these semconv attributes? Also, there's currently still a bunch of separate commits not adhering to the conventional commits standard. Is it expected that I rebase and squash the commits myself? Or is this done eventually on merge?

wpessers · 2025-08-17T22:05:18Z

Also, I tested this by deploying this version of the aws-lambda instrumentation and exporting the spans to Grafana, resulting in traces that look like this:
This is the trace of the Lambda that is triggered by SQS, so it contains the regular function invocation span as the parent, followed by a process span for each SQS record in this batch. Each of these has a span link back to the message producer span, as you can see.

If we follow such a span link, we end up at the message producer span:

gkaskonas · 2025-08-27T13:10:37Z

Can this be merged, please?

jj22ee · 2025-08-28T18:56:06Z

Hey, planning to take a look this week.

wpessers · 2025-09-01T17:38:27Z

Hi @jj22ee, we've discussed this in the FaaS SIG again. The span link visualization that I showed above for Grafana is not generally supported by other observability platforms. Therefore we want to change this PR so that it is configurable: either use the current implementation with span links, or set the processing spans' parent context to the producer span instead of span linking. Users can then configure for themselves how they want the context propagation to work.

That way people using different observability platforms also get the benefits of this feature, since the second approach makes this information visible regardless of which platform you export spans to.
Any thoughts?
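As a rough illustration of the two behaviours being proposed, the sketch below picks the parent and links for a processing span based on a mode setting. The option name `SqsPropagationMode`, its values, and `planProcessSpan` are hypothetical; nothing like this exists in the instrumentation's config today.

```typescript
// Minimal stand-in for a span context reference.
type SketchSpanRef = { traceId: string; spanId: string };

// Hypothetical option, invented for this sketch only.
type SqsPropagationMode = 'link' | 'parent';

interface ProcessSpanPlan {
  parent: SketchSpanRef;
  links: SketchSpanRef[];
}

function planProcessSpan(
  mode: SqsPropagationMode,
  invocationSpan: SketchSpanRef,
  producerSpan: SketchSpanRef | undefined
): ProcessSpanPlan {
  // Without producer context there is nothing to link or re-parent to.
  if (producerSpan === undefined) {
    return { parent: invocationSpan, links: [] };
  }
  if (mode === 'parent') {
    // The processing span joins the producer's trace directly.
    return { parent: producerSpan, links: [] };
  }
  // 'link': stay in the Lambda invocation's trace and link to the producer.
  return { parent: invocationSpan, links: [producerSpan] };
}
```

The trade-off shows up in the return values: in `'parent'` mode the processing span no longer belongs to the Lambda invocation's trace, which is the complication raised later in this thread.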

jj22ee · 2025-09-03T19:53:07Z

Hi @wpessers

There's some use of deprecated semconv attributes for the messaging spans. But these only have alternatives that are still incubating currently. So I believe we would prefer the deprecated ones to avoid possibly breaking on minor version updates for these semconv attributes?

The workaround that is being done in the contrib repo is to just copy the incubating attributes into the package's semconv.ts file. For example, see this incubating attribute for aws-lambda-instrumentation: https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/packages/instrumentation-aws-lambda/src/semconv.ts
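A sketch of what that workaround could look like for the messaging attributes. The attribute names and values below are real semantic-convention strings, but which of them this PR actually needs is an assumption, and `sqsProcessAttributes` is a hypothetical helper. In a real `semconv.ts` the constants would be exported; they are plain constants here to keep the snippet self-contained.

```typescript
// semconv.ts (sketch): local copies of incubating attribute names, so the
// package does not depend on the incubating entry point of the semconv package.
const ATTR_MESSAGING_SYSTEM = 'messaging.system' as const;
const ATTR_MESSAGING_OPERATION_TYPE = 'messaging.operation.type' as const;
const ATTR_MESSAGING_DESTINATION_NAME = 'messaging.destination.name' as const;
const ATTR_MESSAGING_MESSAGE_ID = 'messaging.message.id' as const;

// Well-known values, also copied locally.
const MESSAGING_SYSTEM_VALUE_AWS_SQS = 'aws_sqs' as const;
const MESSAGING_OPERATION_TYPE_VALUE_PROCESS = 'process' as const;

// Hypothetical helper: attributes for one per-message processing span.
function sqsProcessAttributes(queueName: string, messageId: string): Record<string, string> {
  return {
    [ATTR_MESSAGING_SYSTEM]: MESSAGING_SYSTEM_VALUE_AWS_SQS,
    [ATTR_MESSAGING_OPERATION_TYPE]: MESSAGING_OPERATION_TYPE_VALUE_PROCESS,
    [ATTR_MESSAGING_DESTINATION_NAME]: queueName,
    [ATTR_MESSAGING_MESSAGE_ID]: messageId,
  };
}
```

Copying the strings pins the emitted attribute names to what the package was written against, so a minor semconv release cannot change them silently.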

jj22ee · 2025-09-04T00:21:15Z

Regarding the original issue: #2921 (comment)

In aws-lambda instrumentation, we should identify whether an input event contains sqs records. If it does we should do something similar to what the aws-sdk instrumentation is already doing, to link processing spans back to the producing spans by extracting context from the message system attributes.
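One way to do the "does this event contain SQS records" check is a type guard that looks at the first record's `eventSource`, which matches the approach named in one of this PR's commit titles. This is a sketch under that assumption; the `Sketch*` interfaces are simplified stand-ins for the `aws-lambda` types and `isSqsEvent` is a hypothetical name.

```typescript
// Simplified stand-ins for the SQSEvent / SQSRecord types from "aws-lambda".
interface SketchSqsEventRecord {
  eventSource: string;
  messageId: string;
}

interface SketchSqsEvent {
  Records: SketchSqsEventRecord[];
}

// Lambda delivers SQS batches as { Records: [...] } where every record has
// eventSource "aws:sqs". Checking the first record is enough, because a
// single invocation never mixes event sources.
function isSqsEvent(candidate: unknown): candidate is SketchSqsEvent {
  if (typeof candidate !== 'object' || candidate === null) return false;
  const records = (candidate as { Records?: unknown }).Records;
  if (!Array.isArray(records) || records.length === 0) return false;
  const first = records[0] as { eventSource?: unknown } | null;
  return typeof first === 'object' && first !== null && first.eventSource === 'aws:sqs';
}
```

Because the guard accepts `unknown`, it can be called on whatever the handler receives without assuming anything about the event shape first.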

It was discussed to drop Processing Spans in OTel JS's AWS SDK instrumentation due to implementation limitations and updates to the spec

Tests failing for @aws-sdk/client-sqs >=3.316 #1477 (comment)
- This related PR was merged to AWS SDK Instrumentation recently, that drops processing spans
  - feat(aws-sdk)!: SQS receive: use span links instead of processing spans #2345

However, there is a separate spec specifically for handling SQS in AWS Lambda, which is different from the spec you linked in the description:

https://opentelemetry.io/docs/specs/semconv/faas/aws-lambda/#sqs

This one does state that:

The function invocation span MUST correspond to the SQS event, which is the batch of messages. For each message, an additional span SHOULD be created to correspond with the handling of the SQS message

So this PR does follow the Lambda+SQS Spec to create new process spans for each message.

jj22ee · 2025-09-04T00:52:17Z

I advise you to look at the Lambda+SQS spec, because the details in there differ from this PR (e.g. required usage of the AWS X-Ray Propagator, and using message system attributes instead of message attributes to check for AWSTraceHeader to generate the context for the span link).
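For reference, a dependency-free sketch of reading producer context from the `AWSTraceHeader` system attribute. In a Lambda SQS event the system attributes sit under `record.attributes`, separate from `record.messageAttributes`. A real implementation would use the AWS X-Ray propagator package rather than hand-parsing; `parseXRayTraceHeader` and the `Sketch*` types are hypothetical and only show the header layout.

```typescript
// Stand-in for the part of SQSRecord that carries system attributes.
interface SketchRecordWithSystemAttrs {
  attributes: { AWSTraceHeader?: string };
}

interface SketchRemoteContext {
  traceId: string; // 32 hex chars, OpenTelemetry format
  spanId: string; // 16 hex chars
  sampled: boolean;
}

// X-Ray header layout: "Root=1-<8 hex epoch>-<24 hex>;Parent=<16 hex>;Sampled=1"
function parseXRayTraceHeader(header: string | undefined): SketchRemoteContext | undefined {
  if (!header) return undefined;
  const fields = new Map<string, string>();
  for (const part of header.split(';')) {
    const idx = part.indexOf('=');
    if (idx > 0) fields.set(part.slice(0, idx).trim(), part.slice(idx + 1).trim());
  }
  const root = /^1-([0-9a-f]{8})-([0-9a-f]{24})$/i.exec(fields.get('Root') ?? '');
  const parent = fields.get('Parent') ?? '';
  if (root === null || !/^[0-9a-f]{16}$/i.test(parent)) return undefined;
  return {
    // The OTel trace id is the epoch part and the unique part concatenated.
    traceId: (root[1] + root[2]).toLowerCase(),
    spanId: parent.toLowerCase(),
    sampled: fields.get('Sampled') === '1',
  };
}

function producerContextFromSystemAttributes(
  record: SketchRecordWithSystemAttrs
): SketchRemoteContext | undefined {
  return parseXRayTraceHeader(record.attributes.AWSTraceHeader);
}
```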

Therefore we want to change this PR a bit so that we can make it configurable to either use the current implementation with span links or set the processing spans' parent context to the producer span instead of span linking. Users can then configure themselves how they want the context propagation to work.

I'd recommend just starting with the initial support for process spans for each message with span links, since that is part of the spec. Do you know if the alternative configuration will also be added to the spec? It sounds like it might be complicated, because it still makes sense for the processing spans' parent to be the Lambda function span.

wpessers · 2025-09-04T11:07:22Z

It was discussed to drop Processing Spans in OTel JS's AWS SDK instrumentation due to implementation limitations and updates to the spec

Tests failing for @aws-sdk/client-sqs >=3.316 #1477 (comment)

This related PR was merged to AWS SDK Instrumentation recently, that drops processing spans

feat(aws-sdk)!: SQS receive: use span links instead of processing spans #2345

Oh, I see. This was apparently ongoing right before I opened this PR, but I somehow missed it.

wpessers · 2025-09-04T11:19:33Z

I advise you to look at the Lambda+SQS spec, because the details in there differ from this PR (e.g. required usage of the AWS X-Ray Propagator, and using message system attributes instead of message attributes to check for AWSTraceHeader to generate the context for the span link).

That makes sense, will read through it and make the necessary changes.

I'd recommend just starting with the initial support for process spans for each message with span links, since that is part of the spec. Do you know if the alternative configuration will also be added to the spec? It sounds like it might be complicated, because it still makes sense for the processing spans' parent to be the Lambda function span.

Sounds good, I think we can indeed just start with the implementation adhering to the spec. I don't know if the alternative configuration will be added to the spec but it's probably worth discussing. Agree, it does indeed make sense for the parent to be the Lambda Function span. It's just a shame that this will not be represented properly in a lot of observability platforms. If we can (after the initial implementation) also add an experimental feature to allow end-users to opt-in to the alternative way of creating the processing spans with the producer span as their parent, then we can properly evaluate whether a spec change would be warranted.

jj22ee · 2025-09-05T07:23:49Z

Commenting again about the processing spans created for each message in the SQS batch.

AWS SDK instrumentation dropped process spans for 2 reasons (context from this comment & this issue):
1. Limited ability to create processing spans automatically (e.g. it requires a specific for-loop setup to create these spans).
2. The new batch receiving spec doesn't mention creating processing spans, but instead says to add all producer span links to the single consumer span that receives all the producer messages.

Reason 1 still applies in the Lambda case. Reason 2 does not fully apply, however, as the Lambda spec still says the "function invocation span MUST correspond to the SQS event, which is the batch of messages. For each message, an additional span SHOULD be created to correspond with the handling of the SQS message" while still acknowledging the limitations: "automatic instrumentation mechanisms without code change will often not be able to instrument the processing of the individual messages".

So I'm wondering if we should still avoid creating these processing spans (deviating from the Lambda spec), and instead just add links to all producer spans on the single Lambda function span, similar to what the AWS SDK instrumentation does today. @trentm do you think we should still drop process spans here, despite the Lambda spec?
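A sketch of that alternative: instead of one processing span per message, gather the producer contexts of the whole batch and attach them as links on the one function invocation span. `collectInvocationLinks` and the `Sketch*` types are hypothetical, and de-duplicating identical producer contexts is a choice made for this sketch, not something the spec requires.

```typescript
interface SketchProducerRef {
  traceId: string;
  spanId: string;
}

// A batch record whose producer context (if any) was already extracted.
interface SketchBatchRecord {
  messageId: string;
  producer?: SketchProducerRef;
}

// Returns the links that the single invocation span would carry.
function collectInvocationLinks(records: SketchBatchRecord[]): SketchProducerRef[] {
  const seen = new Set<string>();
  const links: SketchProducerRef[] = [];
  for (const record of records) {
    if (record.producer === undefined) continue; // no context on this message
    const key = `${record.producer.traceId}:${record.producer.spanId}`;
    if (seen.has(key)) continue; // same producer span already linked
    seen.add(key);
    links.push(record.producer);
  }
  return links;
}
```

Links generally have to be supplied when the span is started, so this collection step would need to run before the invocation span is created.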

wpessers · 2025-09-15T18:26:39Z

It's clear to me what changed in the general messaging spec and why the changes were made. I'm merely questioning whether the decision to go all-in on span links made sense from an end-user perspective, having noticed how little support there is for visualizing this information in observability platforms. But I suppose it is up to them to implement the spec, not the other way around. I also agree that it probably makes more sense to align the Lambda spec with the general messaging spec.

wpessers · 2025-10-27T13:34:09Z

@jj22ee regarding your comment about using message system attributes and the X-Ray propagator: this is again something that is currently in the spec, but I'm not sure about its feasibility. I posed a question to the semconv maintainers about this as well. All AWS SDK instrumentation libraries across the different platforms that I know of currently implement context propagation through message attributes. So if we want to use message system attributes in the Lambda instrumentation, that requires changes in the AWS SDK instrumentation.
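For comparison, propagation through message attributes comes down to a getter over `record.messageAttributes`. The sketch below is modelled on the commit titles in this PR (empty key list for a null carrier, fallback when `stringValue` is undefined) and is not the PR's code. `SketchTextMapGetter` is a local interface with a shape similar to OpenTelemetry's `TextMapGetter`, and the `value` fallback field is an assumption.

```typescript
interface SketchMessageAttribute {
  stringValue?: string;
  value?: string; // assumed fallback field, not confirmed by the PR
  dataType?: string;
}

type SketchAttributeCarrier = Record<string, SketchMessageAttribute> | null | undefined;

// Local stand-in with a shape similar to OpenTelemetry's TextMapGetter.
interface SketchTextMapGetter<Carrier> {
  keys(carrier: Carrier): string[];
  get(carrier: Carrier, key: string): string | undefined;
}

const sketchSqsContextGetter: SketchTextMapGetter<SketchAttributeCarrier> = {
  keys(carrier) {
    // A record without message attributes simply exposes no keys.
    return carrier == null ? [] : Object.keys(carrier);
  },
  get(carrier, key) {
    const attribute = carrier?.[key];
    // Prefer stringValue and fall back to value when it is undefined.
    return attribute?.stringValue ?? attribute?.value;
  },
};
```

A propagator calling `get(carrier, 'traceparent')` through such a getter only sees what the producer-side instrumentation wrote into the message attributes, which is why switching to system attributes would need matching changes on the producer side.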

pichlermarc · 2025-12-03T17:22:03Z

Looks like this PR has stalled due to inconsistencies in the https://github.com/open-telemetry/semantic-conventions/ repo. @wpessers @jj22ee would you mind opening an issue there to seek guidance on how to resolve the situation here?

trentm · 2025-12-03T17:23:50Z

FWIW, here is the PR that removed processing spans and usage of @opentelemetry/propagation-utils from the SQS instrumentation in the instrumentation-aws-sdk package: #2345

wpessers · 2025-12-03T17:30:54Z

Yes, I know what to do next based on feedback received earlier. Just needed some time to get this going again. Other issues have taken priority

wpessers requested a review from a team as a code owner August 12, 2025 22:17

github-actions bot assigned jj22ee Aug 12, 2025

github-actions bot requested a review from jj22ee August 12, 2025 22:17

wpessers marked this pull request as draft August 12, 2025 22:17

wpessers force-pushed the feat/instrumentation-aws-lambda/sqs-context-propagation branch from e0722a0 to 69c3ef1 Compare August 13, 2025 21:49

github-actions bot added the pkg:instrumentation-aws-lambda label Aug 13, 2025

wpessers mentioned this pull request Aug 16, 2025

Distributed traces across AWS services are not connected open-telemetry/opentelemetry-lambda#1787

Open

wpessers marked this pull request as ready for review August 17, 2025 21:58

wpessers added 7 commits November 20, 2025 13:57

- eaf22c1: Propagate pubsub context when lambda event is an sqs event
- 0d97193: Rename loop var, underscores are not used to mark unused vars elsewhere in this code
- 5b4c137: Add sqs handler test for promise style async handler
- ee28e44: Return empty array from sqsContextGetter keys method if carrier is null. And fallback to message attribute value if stringValue is undefined
- 39a8908: Type message param as SQSRecord in anonymous function used to build SpanDetails
- 2d26a05: Refactor and format lambda handler tests
- ee602fc: Add propagation-utils dependency in instrumentation-aws-lambda package

wpessers added 4 commits November 20, 2025 14:31

- 543e484: Update root level lockfile
- 3442633: Look at first record eventSource in aws lambda event Records array to determine if it is an sqs event
- edb6895: Remove span attributes that are no longer part of the semantic conventions spec for messaging spans
- 3b539be: Test sqsContextGetter keys method

wpessers force-pushed the feat/instrumentation-aws-lambda/sqs-context-propagation branch from 1dca665 to 3b539be Compare November 22, 2025 17:17
